FOREWORD


This document outlines the rationale for and results of a study
of multispectral scanner imagery to compare the advantages of
using high-gain (3X) instead of normal-gain (1X) data from the
Land Satellites of the Earth Resources Program for the
recognition of vegetation for agricultural applications. The Earth
Observations Division, Science and Applications Directorate,
of the Lyndon B. Johnson Space Center, National Aeronautics
and Space Administration, directed the study. Dr. Stanton Yao
of Lockheed Electronics Company, Inc., Aerospace Systems
Division, Houston, Texas, coordinated the effort under
Contract NAS 9-12200 and documented the analysis and results
in this report.
1.0 INTRODUCTION


Prior to the launch of the second Land Satellite
(Landsat-2), the Earth Observations Division (EOD), Science
and Applications Directorate (S&AD), of the Lyndon B. Johnson
Space Center (JSC), National Aeronautics and Space
Administration (NASA), undertook a study of multispectral scanner
(MSS) data from the first Land Satellite [Landsat-1, formerly
called the Earth Resources Technology Satellite (ERTS-1)]. The
purpose of the study was to compare the advantages of using the
high-gain (3X) data from MSS bands 4 and 5 as opposed to
normal-gain (1X) data for agricultural applications such as
the Large Area Crop Inventory Experiment (LACIE).

The research involved obtaining ground truth data
coincidentally with high-gain MSS data from Landsat-1 covering
sites in Imperial Valley, California, on December 19 and 20,
1974. To avoid site dependence, additional high-gain data
were gathered over intensive test sites (ITS's) in Kansas on
December 27, 28, and 29, 1974. The Landsat-1 MSS data were
collected by the Goddard Space Flight Center (GSFC) and
shipped to JSC for analysis.

1.1 RATIONALE BEHIND THE STUDY

Most agricultural crops of interest to the LACIE project
manifest themselves with reflectances that occupy the lower
register of the scales in the visible bands (Landsat MSS
bands 4 and 5) during certain times in the growing seasons.
Both the sensitivity and the dynamic range of the MSS output
would increase using high-gain data, at the cost of possible
saturation of high-reflectance substances such as snow, clouds,
and bare soil. The increase in sensitivity and in the dynamic
range of the data would mean that, for pattern recognition
purposes, finer discriminant boundaries for overlapping regions
could be defined in measurement space. Thus, it was
hypothesized that crop identification accuracies could be improved
with high-gain data.


1.2 BACKGROUND

The only analysis of high-gain data known to this author
was conducted at the Environmental Research Institute of
Michigan (ERIM) (ref. 1). However, this research was not
intended to improve classification accuracy but was focused
on the saturation characteristics of the high-gain data.

The report indicated there were numerous "holes" in the
histograms of the two visible bands. This was because the
high-gain data were not calibrated at GSFC, the same problem
which initially hampered the JSC analysis.

At the time the EOD high-gain study commenced, the goal
was to provide some quick results in order to qualify the
anticipated sensitivity requirements of Landsat-2 data for
LACIE applications. The study was to last only a few weeks;
however, no calibrated high-gain data were available until
March 31, 1975 (after the launch of Landsat-2). Thus,
preliminary conclusions were reached based strictly on the
results of the uncalibrated data initially received from
GSFC. Those results indicated that high-gain data provided
no significant classification accuracy improvements over
normal-gain data. The negative tone of this conclusion
resulted in a downgraded priority for this study, which in
turn delayed calibration of the data by GSFC for 3 months.
Fortunately, there was no significant change in the results
of the analysis when calibrated data were used.



2.0 OBJECTIVES


The two main objectives of the EOD study of the high-gain
data were:

a. To determine the variations of agricultural crop
reflectances as a function of conditions such as growing season,
latitude, crop species, and atmospheric conditions, based
on an aggregated Landsat-1 MSS analysis previously
performed at JSC. In other words, the saturation
characteristics of the Landsat-1 data were to be categorized so
that, in the event the high-gain data option was exercised
for LACIE applications, a strategy as to when and where
to use it could be derived.

b. To demonstrate whether or not, under appropriate
conditions and using both the supervised and unsupervised
classification approaches, the high-gain data would yield
improved classification accuracy and proportional
estimation for agricultural applications when compared to the
normal-gain data.



3.0 APPROACH


To satisfy the first objective of the study, temporally
registered Landsat-1 data over Hill County, Montana (high
latitude), Rice County, Kansas (medium latitude), and the
Texas Panhandle (low latitude) were obtained with and
without Sun angle correction. Variations in reflectances of
individual crop species from some selected fields throughout
the growing season in the Hill County site were also obtained.
Figures 1 through 4 are plots of the reflectance counts of
MSS bands 4 and 5 as a function of growing season in Hill
County; figures 1 and 2 use data from six winter wheat and
two spring wheat fields, whereas figures 3 and 4 use
averaged data from seven different crops. Figure 5 shows plots
of Finney County, Kansas, data, whereas figure 6 shows the
Texas Panhandle data, all as a function of growing season.

Figure 4.- Mean class reflectance for various classes of
vegetation in Hill County (MSS band 5).

To satisfy the second objective, the following original
experimental design was utilized.

It was determined that the Imperial Valley, an
agricultural area usually under clear skies, would be in the
overlap region of Landsat-1 MSS images on both December 19 and
December 20, 1974. It was anticipated that on December 19
data would be taken under the high-gain option and that the
next day normal-gain data would be taken for comparison. A
ground truth team was dispatched to the test site on those
two dates to take atmospheric transmittance readings and to
identify crops and secure crop growth information. However,
on December 19, 1974, the data taken over the Imperial Valley
were considered unsatisfactory because of smog and haze. The
GSFC then proceeded to gather high-gain data on December 20,
1974; as a result, no normal-gain data were collected for
comparison. The JSC then devised a method of reducing the
high-gain data mathematically to simulate the normal-gain
data, as shown in appendix A. Originally, the reduction
scheme included a 2X option, but the aforementioned scheme
could not simulate 2X data, and this option was excluded
from the analysis. The F-matrix multiplication option,
which is part of the classification system (LARSYS) of the
Laboratory for Applications of Remote Sensing of Purdue
University (LARS), was modified to accomplish the reduction
technique. (See appendix B for the deck setup used for
processing these data.) The high- and normal-gain data
tapes from both December 19 and December 20 imagery were
then prepared for processing on the Earth Resources
Interactive Processing System (ERIPS). The method consisted of
the following steps:

• The December 19 high- and normal-gain data were both 
registered to the December 20 imagery so that a simple 
set of field definition coordinates could be used on all 
four image sets (see appendix C) . 

• Training and test fields were defined and statistics were 
computed for six classes of interest (wheat, cotton, 
alfalfa, sugar beets, lettuce, and bare soil). The 
standard maximum likelihood classifier with assumed 
equal a priori probabilities was used to classify the 
test area of interest. 

• All four sets of imagery were subjected to a detailed
clustering analysis on the ERIPS. An approach which
generates class statistics by reading in the even lines
of the image and using them to classify the odd lines of
the image, and vice versa, was planned.

• Color images created from both the high- and normal-gain 
versions were subjected to the scrutiny of an analyst- 
interpreter. The addition of human judgment, along with 
machine processing, for the analysis and comparison of 
the high- and normal-gain data completed all the goals 
stated in section 2b. 
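
The classification steps above can be illustrated with a minimal sketch. It is not the ERIPS or LARSYS implementation: it assumes only two bands, labeled pixels, Gaussian class densities, and equal a priori probabilities, and it borrows the planned even-line/odd-line split as a train/test scheme. All function names and the synthetic demonstration image are hypothetical.

```python
import math


def class_stats(pixels):
    """Mean vector and 2x2 covariance (s11, s12, s22) of two-band samples."""
    n = len(pixels)
    m1 = sum(p[0] for p in pixels) / n
    m2 = sum(p[1] for p in pixels) / n
    s11 = sum((p[0] - m1) ** 2 for p in pixels) / (n - 1)
    s22 = sum((p[1] - m2) ** 2 for p in pixels) / (n - 1)
    s12 = sum((p[0] - m1) * (p[1] - m2) for p in pixels) / (n - 1)
    return (m1, m2), (s11, s12, s22)


def log_likelihood(x, stats):
    """Gaussian log-likelihood, dropping the constant shared by all classes."""
    (m1, m2), (s11, s12, s22) = stats
    det = s11 * s22 - s12 * s12
    d1, d2 = x[0] - m1, x[1] - m2
    maha = (s22 * d1 * d1 - 2.0 * s12 * d1 * d2 + s11 * d2 * d2) / det
    return -0.5 * (math.log(det) + maha)


def classify(x, stats_by_class):
    """Equal priors: pick the class with the largest likelihood."""
    return max(stats_by_class, key=lambda c: log_likelihood(x, stats_by_class[c]))


def train(lines):
    """Collect labeled pixels from image lines and compute class statistics."""
    by_class = {}
    for line in lines:
        for label, pixel in line:
            by_class.setdefault(label, []).append(pixel)
    return {label: class_stats(p) for label, p in by_class.items()}


def accuracy(train_lines, test_lines):
    """Fraction of test pixels assigned to their true class."""
    stats = train(train_lines)
    hits = total = 0
    for line in test_lines:
        for label, pixel in line:
            hits += classify(pixel, stats) == label
            total += 1
    return hits / total


def split_even_odd(image):
    """Split an image (list of lines) into even- and odd-indexed lines."""
    return image[0::2], image[1::2]


def make_demo_image(n_lines=4):
    """Synthetic two-class image; counts are invented for illustration."""
    offsets = [(-2, 1), (1, -2), (2, 2), (-1, -1), (0, 3), (3, 0)]
    centers = {"wheat": (20, 30), "bare soil": (60, 70)}
    line = [(label, (c4 + dx, c5 + dy))
            for label, (c4, c5) in centers.items()
            for dx, dy in offsets]
    return [list(line) for _ in range(n_lines)]


if __name__ == "__main__":
    even, odd = split_even_odd(make_demo_image())
    print("train even, test odd:", accuracy(even, odd))
    print("train odd, test even:", accuracy(odd, even))
```

Running the same procedure once on high-gain counts and once on simulated normal-gain counts would give the kind of paired accuracy comparison described in this section.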
4.0 ANALYSIS AND INTERPRETATION OF RESULTS


Because of the poor quality of the calibrated data and 
time allocation problems on the ERIPS, the analysis has not 
progressed as planned and certain study areas have not been 
investigated. In addition, as more data were examined, new 
areas of interest were disclosed. Specific areas of followup 
analysis are recommended in section 5. 

The major result of this investigation of high-gain
versus normal-gain Landsat MSS data was not anticipated;
that is, the use of high-gain data with its inherent better
sensitivity and dynamic range in MSS bands 4 and 5 does
not significantly improve Landsat performance for LACIE
applications within the context of the stated objectives. The
following points support this finding.

a. The comparison of calibrated and uncalibrated data in
table I indicates that any improvement in the classification
accuracy of high-gain over simulated normal-gain
data is negligible for the six major crop classes
considered. The same conclusion was reached when 12 instead
of 6 classes were used in the classification or when
different a priori probabilities were assigned. To test
possible site dependence, one of the Kansas high-gain
tapes (Finney County on December 28, 1974) with
questionable ground truth data was subjected to the same
analysis procedure used for the Imperial Valley data.
The identical conclusion was reached. Therefore, insofar
as the techniques of supervised pattern recognition are
concerned, the high-gain data appear to offer no
advantage. It is noteworthy that the statistics in table I
indicate that calibration of the December 19 hazy data
greatly improved classification accuracy whereas
calibration had little effect on the clear-day data of
December 20. However, the relationship between the high-
and the normal-gain data remained consistent. An open
question is what results would be obtained if true
instead of simulated normal-gain data were used. Since
the two cannot be obtained simultaneously, the effects
caused by gain changes and contributing temporal factors
would have to be taken into consideration, which would
make the analysis more difficult and complicated. Using
existing technology, simulation seems to be the best
approach.

TABLE I.- PERCENTAGE OF CLASSIFICATION ACCURACY COMPARISON FOR IMPERIAL VALLEY

b. The analyst-interpreters examined color film copies made 
from both the high- and the normal-gain Imperial Valley 
imagery which was taken on December 19 (hazy) and 
December 20 (clear). They could detect no significant 
differences in the quality of the imagery. This supports 
the conclusion that the high-gain imagery is not superior 
to normal-gain imagery. Figures 7 and 8 are examples of 
the gray-level images of both the high- and the normal- 
gain data. 

c. In addition to its lack of improved quality, the high-gain
data can be used only sparingly, during the winter
season with low Sun elevations and at high latitudes;
otherwise agricultural targets will be excessively
saturated (figs. 1 through 4). Since LACIE imagery must often
be gathered on hazy days when the average reflectance
can increase substantially, this excessive saturation of
agricultural sites becomes even more significant. The
December 19 Imperial Valley data were taken under hazy
conditions. Even though the Sun elevation was a low 26°,
the histogram in figure 9 indicates near saturation
(127 counts) for the class of lettuce depicted by MSS
band 5.

Figure 8.- Simulated normal-gain data (MSS band 5,
frame 1880-1736*1).

d. The class divergence analysis on the Imperial Valley
data also showed that MSS band 5 (high gain) and band 7
(normal gain) are the most important bands for the
classes considered. Perhaps research using the
unsupervised pattern recognition technique of clustering will
be able to identify some improvement using high-gain
data. Clustering failed when uncalibrated data were
used, and the cluster maps showed only excessive
striping effects (fig. 10).
Figure 10.- Clustering of uncalibrated data
showing striping effect.
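
The report does not say which divergence measure was used in the band-ranking analysis of item d. As an assumption for illustration only, the pairwise divergence between two Gaussian class densities, a measure commonly used for feature selection in maximum likelihood classification, can be written as follows, where \mu_i and \Sigma_i denote the mean vector and covariance matrix of class i in the candidate set of bands:

```latex
D_{ij} = \tfrac{1}{2}\,\operatorname{tr}\!\left[(\Sigma_i - \Sigma_j)\,(\Sigma_j^{-1} - \Sigma_i^{-1})\right]
       + \tfrac{1}{2}\,\operatorname{tr}\!\left[(\Sigma_i^{-1} + \Sigma_j^{-1})\,(\mu_i - \mu_j)(\mu_i - \mu_j)^{T}\right]
```

Under this measure, the bands that give the largest divergence over the class pairs of interest are ranked as the most important.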
5.0 CONCLUSION


In summary, the maximum likelihood classifier on the
ERIPS failed to show any improvement in accuracy when
comparing the high-gain Landsat data with the simulated
normal-gain data. Even if an improvement in accuracy had been
detected, the timespan within the crop growing season when
the use of high-gain data could be advantageous is limited.
It would seem that the high-gain data with their better
sensitivities and dynamic ranges would offer some manner of
improvement in classification accuracy. However, no
improvement with their use has been detected as a result of this
analysis.

Because of the lack of time, a thorough study of the
recently received calibrated data has not been undertaken.
Such an analysis would require new procedures to identify
certain characteristics of the clusters which are apparent
only in high-gain data and which would indicate that the use
of such data might enhance recognition accuracy.

A total of six sets of Landsat-1 imagery are now
available for the analysis of gain effects. Three of the
sets are high-gain data in MSS bands 4 and 5, whereas the
other three are simulated normal-gain data. The four sets
obtained over Imperial Valley have good supporting ground
truth information for training, whereas the two sets taken
over Kansas do not. The use of various combinations of the
six data sets is recommended in order to uncover the possible
advantages of using the high-gain data.

• Method 1 - Use clustering techniques or other appropriate
methods to determine whether or not substantial information
that is unavailable with normal-gain data is inherent in
high-gain data. If the conclusion is affirmative, a study
should be made to determine how this additional information
can best be made available for LACIE applications.

• Method 2 - Study the impact of high-gain data on
classifiers other than the maximum likelihood classifier, such
as the single-class and the two-class classifiers currently
being evaluated for LACIE (ref. 2).

• Method 3 - Make use of the existing data sets and the
statistical results that have been obtained in the current
analysis (for example, the histograms and the field and
class means and variances) in order to extend the study
into the problem of homogeneity of training field
statistics.

• Method 4 - Note that the two sets of Imperial Valley
high-gain data, obtained 24 hours apart, were taken under
quite different atmospheric conditions. The December 19
images are hazy, whereas the December 20 images are clear.
Some readings of the atmospheric transmittances are also
available near the test site. The two sets of data could
be useful to those interested in atmospheric effects
upon signature extension and indispensable to those
studying temporal signature extension and the various
techniques of signature extension. For example, the data
sets could be used immediately to test haze correction
algorithms such as the Maximum Likelihood Estimation of
Signature Transformation (MLEST) techniques (ref. 3).
APPENDIX A


MATHEMATICAL BASIS FOR REDUCING 3X DATA TO 1X AND 2X DATA

The available Landsat-1 imagery mentioned in this report
was obtained in the high-gain (3X) mode; that is, in MSS
bands 4 and 5, the electronic amplification at the sensor
output was increased to three times that in the normal-gain
(1X) mode. In this appendix, the mathematical basis for
reducing the high-gain data to simulated normal-gain (1X)
and double-gain (2X) data is discussed.

Since it is stated in the ERTS handbook that a linear
relationship exists between the scene radiance and the data
counts obtained from the computer-compatible tape (CCT), the
method of reducing the 3X data to 1X is uncomplicated. The
analog-to-digital conversion, data compression and
decompression, and so forth need not be considered. All that is
needed is to divide the data counts by three and truncate:


3X    0,1,2    3,4,5    6,7,8    9,10,11    ...    123,124,125    126,127

1X      0        1        2         3       ...         41           42
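
The divide-and-truncate rule tabulated above can be sketched in a few lines. This is only an illustration, not the modified LARSYS F-matrix routine; the function names are hypothetical, and the 0 to 127 input range is taken from the saturation level quoted in this appendix.

```python
SATURATION_3X = 127  # saturation level of the 3X counts, as stated in the text


def reduce_3x_to_1x(count_3x):
    """Simulate a normal-gain (1X) count: divide by three and truncate."""
    if not 0 <= count_3x <= SATURATION_3X:
        raise ValueError("3X count must lie in 0..127")
    return count_3x // 3


def reduce_line(counts_3x):
    """Apply the reduction to one scan line of 3X counts."""
    return [reduce_3x_to_1x(c) for c in counts_3x]


if __name__ == "__main__":
    print(reduce_line([0, 1, 2, 3, 11, 125, 126, 127]))
    print(max(reduce_3x_to_1x(c) for c in range(SATURATION_3X + 1)))
```

Every group of three consecutive 3X counts collapses onto one 1X count, so the simulated data carry one-third as many distinct levels as the high-gain original.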



Thus, the saturation level of 3X data at 127 counts will
be reduced to 42 counts in the 1X data, and no data count in
1X data will be greater than 42 counts. However, when
reducing 3X data to 2X data, an additional problem arises, as
discussed in the unpublished notes on conversion compiled by
R. Legault of ERIM.
"Suppose we have 3X gain data with integer counts
1, 2, 3, .... Simple conversion to 2X gain data involves
multiplying by 2/3 and selecting an interval (truncation)
rule such as: After multiplication by 2/3, all 3X bins with
count strictly less than integer n but greater than or equal
to integer count n - 1 are named 2X gain-bin count n - 1.
Multiplication of 3X gain bin counts by 2/3 produces a
sequence

0, 2/3 | 1 1/3 | 2, 2 2/3 | 3 1/3 | 4, 4 2/3 | 5 1/3 | 6, 6 2/3 | 7 1/3 | ...
   2   |   1   |    2     |   1   |    2     |   1   |    2     |   1

and use of the above interval (truncation) rule places either
one or two 3X bins in a 2X bin. Consequently, a histogram of
the 2X simulated data will exhibit the 'missing bin phenomenon'
which will impact on classification results.


"The figure below represents the situation in terms of
analogue signal amplitude.

2X gain bin |  n  | n+1 | n+2 | n+3 |
3X gain bin | m |m+1|m+2|m+3|m+4|
                signal amplitude

If the lower signal level of 2X bin n coincides with the
lower signal level of 3X bin m (this should be true for
m = n = 0), then 3X bin m + 1 lies half in 2X bin n and
half in 2X bin n + 1. Assuming that the signal amplitudes
in the 3X bins are uniformly distributed, then the rule for
creating 2X bin counts from 3X bins should be: observations
in 3X bins m and m + 2 are assigned to 2X bins n
and n + 1, respectively. An observation in 3X bin m + 1
is assigned to 2X bin n with probability 1/2, otherwise bin
n + 1. Another less satisfactory rule would be to assign
the first m + 1 observation to 2X bin n, the second m + 1
observation to 2X bin n + 1, and so on, an alternating rule
which puts half of the 3X gain bin m + 1 observations into
2X gain bin n and half into 2X gain bin n + 1. Either
method would take some computing time to do on an ERTS frame."
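
The three 3X-to-2X rules described in the quoted notes can be compared with a short sketch. It is illustrative only: the function names are hypothetical, the alternating rule is assumed to keep a separate toggle for each straddling 3X bin, and the input is an invented flat histogram rather than Landsat data.

```python
import random
from collections import Counter


def to_2x_truncate(count_3x):
    """Plain rule: multiply by 2/3 and truncate."""
    return (2 * count_3x) // 3


def to_2x_random(count_3x, rng):
    """Split the straddling bin m + 1 between 2X bins n and n + 1 at random."""
    group, pos = divmod(count_3x, 3)
    n = 2 * group
    if pos == 0:
        return n
    if pos == 2:
        return n + 1
    return n if rng.random() < 0.5 else n + 1


def make_alternating_rule():
    """Alternating rule: successive m + 1 observations go to n, n + 1, n, ..."""
    send_high = {}  # per-3X-bin toggle (an assumption about the bookkeeping)

    def rule(count_3x):
        group, pos = divmod(count_3x, 3)
        n = 2 * group
        if pos == 0:
            return n
        if pos == 2:
            return n + 1
        high = send_high.get(count_3x, False)
        send_high[count_3x] = not high
        return n + 1 if high else n

    return rule


if __name__ == "__main__":
    # Flat 3X histogram: 100 observations in each of the bins 0..11.
    observations = [c for c in range(12) for _ in range(100)]
    rng = random.Random(0)
    alternate = make_alternating_rule()
    print("truncate   ", sorted(Counter(map(to_2x_truncate, observations)).items()))
    print("random     ", sorted(Counter(to_2x_random(c, rng) for c in observations).items()))
    print("alternating", sorted(Counter(map(alternate, observations)).items()))
```

With a flat input histogram, truncation yields the alternating 2:1 comb pattern, whereas the randomized and alternating rules spread the straddling bin evenly and give a flat (or nearly flat) 2X histogram.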
APPENDIX B


SAMPLE DECK LAYOUT FOR THE
GAIN REDUCTION PROGRAM



DECK SETUP FOR PROGRAM Gain Reduction


   10 FORMAT (4F10.4)
      READ (5,10) (CON(I),I=1,4)
      READ (5,10) (MIN(I),I=1,4)
      READ (5,10) (MAX(I),I=1,4)

7 FOR,* DATATR,DATATR
7 XQT CUR
7 ASG K HISFIL
7ANR ASG C=1879 1
7S ASG L=SAVE
7N MSG FILE REQ. TAPE 3 FH432 0 FSTKN U
















APPENDIX C 


PRELIMINARY ANALYSIS OF REGISTRATION 
ERROR UPON CLASSIFICATION 



This brief writeup deals with a preliminary study of
the effects of scene misregistration upon classification
accuracy based on purely empirical means. In particular, the
study gives an upper bound on the smallest variations
in accuracy that one can expect when extreme
care is taken to ensure "good" registration using Landsat data.

C.1 PROJECT DESCRIPTION


Two sets of digital Landsat-1 imagery (frames 1879-1 ? 3 "*0
and 1880-173t'4) were obtained 24 hours apart over Imperial
Valley. It is understood that since the two images were
taken from adjacent orbits, "substantial" rotational
misalignment exists between the images. Training and test field
boundaries were defined on the data from frame 1880 using
ERIPS, and the data from frame 1879 were registered repeatedly
(four times) onto the frame 1880 data. The registered data
sets from frame 1879 were then classified on ERIPS using the
same boundaries as defined on the frame 1880 data set, and the
resulting accuracies were compared. The following precautions
were taken to ensure that ERIPS hardware trouble and
operator bias would not contribute to the registration error.

• To eliminate operator and screen cursor biases, the
Sequential Similarity Detection Algorithm (SSDA) was
used for data correlation. It was found that because
the two data sets were taken only a day apart, the SSDA
worked very well in this situation. The first-order
least squares fit based on the SSDA seldom gives more
than one-pixel residuals. When a residual of more than
one pixel occurs, the correlation point is deleted.

• To ensure that the assignment of a reference data screen
would contribute no registration system error, runs 1 and
3 and runs 2 and 4 were assigned different reference
screens.

• To totally eliminate cursor positional error, data
magnifications from 1 to 3 were used for different
registration runs.

• To ensure that correlation points were well distributed
over the 500-line by 510-pixel image, as many as 96 points
were used for the SSDA and as many as 77 points were
entered for least squares computation.
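
The sequential similarity idea behind the SSDA can be sketched as follows. This is not the ERIPS implementation, whose details are not given here; it is a minimal version that assumes a sum-of-absolute-differences error and uses the best error found so far as the abort threshold. The function name and the test arrays are hypothetical.

```python
def ssda_match(search, template):
    """Find the offset of `template` inside `search` (lists of rows of counts).

    Absolute differences are accumulated pixel by pixel, and a candidate
    offset is abandoned as soon as its running error reaches the best
    error found so far, which is what makes the search sequential.
    """
    th, tw = len(template), len(template[0])
    sh, sw = len(search), len(search[0])
    best_offset, best_error = None, float("inf")
    for r in range(sh - th + 1):
        for c in range(sw - tw + 1):
            error = 0
            aborted = False
            for i in range(th):
                for j in range(tw):
                    error += abs(search[r + i][c + j] - template[i][j])
                    if error >= best_error:
                        aborted = True  # cannot beat the current best
                        break
                if aborted:
                    break
            if not aborted:
                best_offset, best_error = (r, c), error
    return best_offset, best_error


if __name__ == "__main__":
    search = [[10 * r + c for c in range(6)] for r in range(5)]
    template = [row[3:5] for row in search[2:4]]  # patch cut at row 2, column 3
    print(ssda_match(search, template))
```

Each matched offset would supply one correlation point; a first-order least squares fit through such points, with points of more than one-pixel residual deleted, then gives the registration transformation described above.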

It was anticipated that "perfect" registration would
result with all the above precautions taken and that the
four classification runs would produce identical results.
However, this was not the outcome; the detailed results and
some comments are presented in section C.2.

C.2 RESULTS AND CONCLUSIONS 

The parameters under which the four registration runs
were made are listed in table C-1. The accuracies of the
resulting classification on six major classes of crops and
soils using identical a priori probabilities and zero
threshold values are presented in table C-2. Note that only the
relative accuracy among different runs is meaningful here.


TABLE C-1.- REGISTRATION RUN PARAMETERS

Parameter                           Run 1   Run 2   Run 3   Run 4

Reference screen                      2       3       2       3
Point magnification                   2       2       3       1
Order of correction polynomial        1       1       1       1
Total correlation points entered     59      62      77      35


TABLE C-2.- CLASSIFICATION ACCURACY COMPARISON (PERCENT)

Class          Run 1   Run 2   Run 3   Run 4   Difference (a)

Wheat          83.5    82.6    82.1    83.5        1.4
Cotton         81.6    81.8    81.0    82.4        1.4
Alfalfa        51.2    52.2    51.2    52.8        1.6
Sugar beets    50.2    50.6    51.0    50.6         .8
Lettuce        59.9    59.9    59.9    59.9         .0
Bare soil      86.8    86.7    87.3    86.7         .6

(a) Indicates differences in percentages of classification
accuracy between best and worst runs.











The differences in percentage of classification accuracy 
between the best and the worst runs for all six classes ranged 
from 0 to 1.6 percent. No particular run shows a clear-cut 
advantage over all others for all classes, indicating all 
four registration runs were good but none was outstanding. 
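
The best-minus-worst differences quoted above can be checked directly from the per-run accuracies of table C-2. The sketch below is only a verification aid; the dictionary and function names are hypothetical, and the run 2 value for bare soil (86.7) is a reading of a partly illegible entry that is consistent with the listed difference of 0.6.

```python
# Classification accuracy (percent) for runs 1 through 4, from table C-2.
ACCURACY = {
    "Wheat": [83.5, 82.6, 82.1, 83.5],
    "Cotton": [81.6, 81.8, 81.0, 82.4],
    "Alfalfa": [51.2, 52.2, 51.2, 52.8],
    "Sugar beets": [50.2, 50.6, 51.0, 50.6],
    "Lettuce": [59.9, 59.9, 59.9, 59.9],
    "Bare soil": [86.8, 86.7, 87.3, 86.7],  # run 2 value is an assumed reading
}


def best_worst_spread(runs):
    """Difference between the best and worst run, rounded to 0.1 percent."""
    return round(max(runs) - min(runs), 1)


SPREADS = {name: best_worst_spread(runs) for name, runs in ACCURACY.items()}

if __name__ == "__main__":
    for name, spread in SPREADS.items():
        print(f"{name:12s} {spread:.1f}")
    print("range:", min(SPREADS.values()), "to", max(SPREADS.values()))
```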

Thus, it can be concluded that: 

• When proper precautions have been taken to do image
registration using Landsat data, a difference of about
1 percentage point in classification accuracy cannot be
used to indicate the degree of accuracy of the registration.

• For a registered image size of approximately 500 by 
500 pixels, it makes no difference for the first-order 
error approximation whether 35 or 77 correlation points 
were entered for least squares computation. 

Two additional comments are made. 

• The differences in percentage of classification accuracy
resulting from the four registration runs can be traced
to the assignment of certain field boundary points to
different classes. Therefore, the possibility existed
that by assigning class thresholds to be certain values
other than zero, the accuracies of the four runs might
be adjusted to be more in line with each other. This
method was tried, but it did not bring the accuracies
closer together.

• In connection with the adjustments discussed above, the 
accuracies of the four registration runs might be brought 
closer together if some interpolation techniques other 
than the nearest neighbor rule were used during registra- 
tion. Because of software limitations, it is not possible 
to investigate this possibility on the ERIPS at the 
present time. 